Energy-Effective Instruction Fetch Unit for Wide Issue Processors
Authors
Abstract
Continuing advances in semiconductor technology and demand for higher performance will lead to more powerful, superpipelined and wider issue processors. Instruction caches in such processors will consume a significant fraction of the on-chip energy due to very wide fetch on each cycle. This paper proposes a new energy-effective design of the fetch unit that exploits the fact that not all instructions in a given I-cache fetch line are used, because of taken branches. A Fetch Mask Determination unit is proposed to detect which instructions in an I-cache access will actually be used, so that none of the other instructions are fetched. The solution is evaluated for 4-, 8-, and 16-wide issue processors in 100 nm technology. Results show an average improvement in the I-cache Energy-Delay product of 20% for the 8-wide issue processor and 33% for the 16-wide issue processor on the SPEC2000 benchmarks, with no negative impact on performance.
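The idea of a fetch mask can be illustrated with a minimal sketch. This is not the paper's Fetch Mask Determination unit, whose design is not described in the abstract; it only assumes that the useful instructions in a fetch line run from the fetch start offset up to and including the first predicted-taken branch, or to the end of the line if there is none. The function name `fetch_mask` and its parameters are hypothetical.

```python
from typing import List, Optional


def fetch_mask(line_width: int,
               start_offset: int,
               taken_branch_offset: Optional[int] = None) -> List[bool]:
    """Return one flag per instruction slot in an I-cache fetch line.

    Assumption (not from the paper): a slot is useful if it lies between
    the fetch start offset and the first predicted-taken branch, inclusive.
    With no predicted-taken branch, every slot from the start offset to the
    end of the line is useful.
    """
    if not 0 <= start_offset < line_width:
        raise ValueError("start_offset must lie inside the fetch line")
    last = line_width - 1 if taken_branch_offset is None else taken_branch_offset
    if not start_offset <= last < line_width:
        raise ValueError("taken_branch_offset must lie at or after start_offset")
    # Slots outside [start_offset, last] would not need to be read.
    return [start_offset <= slot <= last for slot in range(line_width)]


# Example: 8-wide line, fetch enters at slot 2, predicted-taken branch at slot 5.
mask = fetch_mask(8, 2, 5)
useful = sum(mask)  # number of slots that actually need to be fetched
```

In this example only four of the eight slots are marked useful; under the stated assumption, the remaining four are the instructions whose fetch such a unit would aim to avoid.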
Similar resources
Instruction Pre-Processing in Trace Processors
In trace processors, a sequential program is partitioned at run time into “traces.” A trace is an encapsulation of a dynamic sequence of instructions. A processor that uses traces as the unit of sequencing and execution achieves high instruction fetch rates and can support very wide-issue execution engines. We propose a new class of hardware optimizations that transform the instructions within ...
An Exploration Of Instruction Fetch Requirement In Out-of-order Superscalar Processors
Automated design of superscalar processors can provide future in terms a cycles-per-instruction (CPI) using the application program statistics and the 124, Optimization of Instruction Fetch Mechanisms for High Issue Rates 117, A first-order superscalar processor model Karkhanis, Smith 2004. Because superscalar architectures include complicated control logic for out-of-order execu...
Increasing the Instruction Fetch Rate via Block-Structured Instruction Set Architectures (IEEE/ACM International Symposium on Microarchitecture, 1996)
To exploit larger amounts of instruction-level parallelism, processors are being built with wider issue widths and larger numbers of functional units. Instruction fetch rate must also be increased in order to effectively exploit the performance potential of such processors. Block-structured ISAs provide an effective means of increasing the instruction fetch rate. We define an optimization, calle...
Block Based Fetch Engine for Superscalar Processors
The implementation of modern high-performance computers is increasingly directed toward parallelism in the hardware. However, most current fetch units are limited to one branch prediction per cycle and therefore can fetch no more than one basic block per cycle. While fetching a single basic block each cycle is sufficient for implementations that issue a small number of instructions per cyc...
On Augmenting Trace Cache for High-Bandwidth Value Prediction
Value prediction is a technique that breaks true data dependences by predicting the outcome of an instruction and speculatively executes its data-dependent instructions based on the predicted outcome. As the instruction fetch rate and issue rate of processors increase, the potential data dependences among instructions issued in the same cycle also increase. Value prediction and speculative exec...